We investigate efficient methods for packets to navigate in complex networks. The packets are assumed to have
memory, but no previous knowledge of the graph. We assume the graph to be indexed, i.e. every vertex is
associated with a number (accessible to the packets) between one and the size of the graph. We test different
schemes to assign indices and utilize them in packet navigation. Four different network models with very different
topological characteristics are used for testing the schemes. We find that one scheme outperforms the others, and
has an efficiency close to the theoretical optimum. We discuss the use of indexed-graph navigation in peer-to-peer
networking and other distributed information systems.



I. INTRODUCTION 

The interplay between network structure and search dynamics has emerged as a busy sub-field of statistical
network studies (see e.g. Refs. [1, 9, 13, 14]). Consider a simple graph G = (V, E) (where V is a set of n
vertices and E is a set of m edges, i.e. unordered pairs of vertices). Assume information packets travel from a
source vertex s to a destination t. We assume the packets are myopic agents (at a given timestep they have access
to information about the vertices in their neighborhood, but not more), have memory (so they can e.g. perform a
depth-first search) but no previous knowledge of the network. Let $\tau(p)$ be the time for a packet p to travel
between its source and destination. One commonly studied measure of search efficiency is the expectation value of
$\tau$, denoted $\bar{\tau}$, for randomly chosen s and t. In this work we attempt to find efficient ways to index V
and utilize these indices for packet navigation.

We propose two schemes of indexing the vertices, and corresponding methods for packet navigation. These schemes,
along with two depth-first search methods (not using vertex indices for more than remembering the path), are
examined on four network models. We will first present the indexing and search schemes, then the network models
for testing the algorithms, and finally the numerical results.



II. INDEXING AND SEARCH SCHEMES 

Now we turn to the schemes for assigning indices to the vertices and using them in search processes. Our two main
schemes are both inspired by search trees. Packets first move towards a root vertex r, then towards the
destination. Unless the network really is a tree, this approach cannot be exact (a packet is not guaranteed to find
the shortest way both from s to r and from r to t). However, as we will see, one can assign indices such that the
search either from s to r, or from r to t, is certain to be as short as possible. One of our schemes, ASU (accurate
search up), will be such that the shortest upward search is guaranteed; the other, ASD (accurate search down), will
have the shortest possible r to t search.

On a technical note, V is a set of distinct elements and an indexing scheme is a bijection $V \to [1, n]$. In the
remainder of the text we will not explicitly distinguish $i \in V$ from its index.
FIG. 1 Illustration of the ASD (panels (a)-(c)) and ASU (panels (d)-(f)) indexing and search schemes. (a) shows a
search tree where a local search algorithm can find the shortest path from one vertex to another fast. (b) shows a
network indexed by the ASD scheme. The tree used in the construction is identical to the one shown in (a). Panel (c)
shows an ASD search from s to t (with t = 4). On the way from s to r the packet chooses the neighbor (of the current
vertex) with lowest index, which here gives a longer route than the optimal {(9, 10), (10, 1)}. (d) shows a possible
partition of branches of non-root vertices into classes of as similar size as possible (as done in the ASU indexing
scheme). (e) shows a possible indexing based on the partition in (d). Panel (f) displays a search from s to t with
t = 6. The shortest path from s to r is accurately found, but a detour to 6 makes the search from r to t
sub-optimal.



A. The ASD indexing and search 

The numbers 1, ..., n can be arranged into a search tree such that the expected value of $\tau$ scales like log n.
In Fig. 1(a) we give an example of a search tree. To go from source s to destination t a packet first moves to the
root r by going to the neighbor with lowest index value. From the root to the destination, the packet moves to the
neighbor with the largest index smaller than, or equal to, t. Our strategy for the ASD indexing and search scheme is
to construct a spanning tree T(G) for the network; index the tree to make it a search tree; and use the algorithm
above to navigate from s to t. The problem is, however, that real networks are not trees. Imagine adding edges
between vertices of the same heights and branches to the tree in Fig. 1(a): the tree will still be a spanning tree,
but the packets may not take the same path from s to t any more. As we will see, with certain ways of constructing
the tree and indexing the vertices, the search either from s to r or from r to t will be optimal.

We construct T(G) in the following way:

1. Let the root r be a vertex of smallest eccentricity (maximal distance to any other vertex).

2. Construct the tree such that the distances to the root are the same in T(G) and G. In other words, such that all
edges in T go between different neighborhoods $\Gamma_l(r) = \{i \in V : d(i, r) = l\}$ and $\Gamma_{l+1}(r)$ for
some level $0 \le l < h$, where h is the height of the tree (by the choice of r, h is also the radius of the graph).
Such a tree can be constructed by finding the set of followed edges in a breadth-first search starting from r.

When it is not clear which vertex, or edge, to choose in the above construction, we choose one at random from all
the possible candidates. When T is constructed, let the indices be a preordering of the vertices in T (i.e. the
order of first occurrence of the vertices in a depth-first search of T).
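As a concrete illustration of the construction above, the following minimal Python sketch builds a breadth-first
tree from a vertex of smallest eccentricity and numbers the vertices in preorder. The adjacency-dictionary input
and the names `bfs_distances` and `asd_index` are assumptions made for this example, and ties are broken
deterministically here rather than at random as in the text.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source to every vertex reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def asd_index(adj):
    """Return (index, root, children) for a connected graph given as {vertex: [neighbors]}."""
    # Step 1: the root is a vertex of smallest eccentricity
    # (first such vertex in dictionary order; the text picks one at random).
    ecc = {v: max(bfs_distances(adj, v).values()) for v in adj}
    root = min(adj, key=lambda v: ecc[v])

    # Step 2: the edges followed by a breadth-first search from the root form the tree.
    children = {v: [] for v in adj}
    seen = {root}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                children[u].append(v)
                queue.append(v)

    # Indices 1..n in preorder: order of first visit in a depth-first walk of the tree.
    index = {}
    stack = [root]
    while stack:
        u = stack.pop()
        index[u] = len(index) + 1
        stack.extend(reversed(children[u]))
    return index, root, children
```

With this numbering every subtree receives a contiguous block of indices, which is the property the downward
search relies on.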

Now we prove that this indexing and search algorithm always gives the shortest paths from the root to a vertex t.
Let $E_T$ be the edges of T and let $T_i$ be the maximal subtree with i as root. By construction, all vertices in
$T_i$ have indices in $[i, i + |T_i|]$ (where $|\cdot|$ denotes the cardinality of a subgraph). Let $i'$ be the
largest index in i's neighborhood smaller than t. Assume there is an edge $(i, j) \in E \setminus E_T$ that the
search will follow, i.e. that $i' < j < t$. This means that $j \in T_{i'}$. By construction, $i'$ is the only vertex
in $T_{i'}$ at a distance $d(r, i')$ (the distance from the rest of $T_{i'}$ to the root is at least
$d(r, i') + 1$). Since $d(r, i') = d(r, i) + 1$, we have $d(r, j) \ge d(r, i) + 2$, which contradicts the existence
of an edge $(i, j) \in E$. Thus searches from r to t will always follow the edges of T, which also means the r-t
searches will be as short as possible.

Searching upwards, from i to r, in a graph indexed as above is harder. We know that one shortest path goes via a
vertex j with smaller index than i, but there might be sub-optimal paths via indices $i'$ in the intervals
$r < i' < j$ and $j < i' < i$, and there might also be paths via vertices of index larger than i that are optimal.
For example, assume the search tree in Fig. 1(a) comes from a graph with the additional edges (5, 9), (8, 9) and
(9, 10) (see Fig. 1(b)). Then, the shortest path from 9 to r via a vertex of lower index is {(9, 7), (7, 1)}, but
there is an equally long path via a vertex of larger index, {(9, 10), (10, 1)}, and longer paths via vertices both
smaller and larger than 7 but smaller than 9. There is thus no general way of finding the shortest way from s to r.
Instead, we always choose the vertex with the smallest index in the neighborhood. By this strategy a packet will
come closer to r, in index space, for every step.



Furthermore, in tree-like parts of the graph, the search will follow the shortest paths. An illustration of the ASD
search can be found in Fig. 1(c).
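The two phases of the ASD search can be sketched in a few lines of Python. This is only an illustration under
stated assumptions: vertices are named directly by their ASD indices, the root has index 1, the small example graph
is hypothetical (it is not the graph of Fig. 1), and the function name `asd_search` is invented here. The rule that
a packet steps straight to the destination when it is a neighbor follows the neighborhood assumption made for all
methods in this paper.

```python
def asd_search(adj, s, t, root=1):
    """Return the list of vertices visited by an ASD search from s to t.

    adj maps each index to the collection of neighboring indices.
    """
    path = [s]
    cur = s
    going_up = True
    while cur != t:
        if t in adj[cur]:
            # The destination is visible in the neighborhood: go there directly.
            cur = t
        elif going_up and cur != root:
            # Upward phase: move to the neighbor with the lowest index.
            cur = min(adj[cur])
        else:
            # Downward phase: move to the neighbor with the largest index <= t.
            going_up = False
            cur = max(v for v in adj[cur] if v <= t)
        path.append(cur)
    return path


# Hypothetical ASD-indexed graph: tree edges 1-2, 1-5, 2-3, 2-4, 5-6 plus one extra edge 4-6.
example = {1: [2, 5], 2: [1, 3, 4], 3: [2], 4: [2, 6], 5: [1, 6], 6: [5, 4]}
```

In this example `asd_search(example, 4, 5)` visits 4, 2, 1, 5 (three steps), although the path 4, 6, 5 has only two
steps: the upward phase is not guaranteed to be optimal. Searches starting at the root, such as
`asd_search(example, 1, 6)`, follow tree edges only.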

B. The ASU indexing and search 

Consider a tree T(G) constructed as in the previous section and an indexing such that d(i, r) < d(j, r) implies
i < j (i.e., all indices of a level further from the root are larger than those in levels closer to r). With such an
indexing, since the neighbor of a vertex with the smallest index necessarily is one step closer to the root, a
packet can always find one shortest way to the root. But once the packet is at the root the indices are not of so
much help. The search from r to t has to be, essentially, a depth-first search. There are, however, a few tricks to
speed up the search. First, there is no need to search deeper than t: if j > t, then $t \notin T_j$. Second, one can
choose the indices $i, \ldots, i + |\Gamma_l(r)|$ of one level in the tree in a way that narrows down the search.
For example, one can divide the vertices into $\nu$ classes (defined by e.g. the remainder when the index is divided
by $\nu$) and index vertices of connected regions of the graph with indices of the same class. The search can then
be restricted to the same class as the destination. We will pursue this idea with $\nu = 2$.
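The level-ordering property and the resulting upward search can be illustrated with a short Python sketch. It is a
simplification: indices are assigned in plain breadth-first order, which satisfies d(i, r) < d(j, r) implies i < j
but ignores the division into classes. The function names and the example graph are assumptions made for this
illustration.

```python
from collections import deque


def level_order_index(adj, root):
    """Index vertices 1..n in breadth-first order from root, so deeper levels get larger indices."""
    index = {root: 1}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in index:
                index[v] = len(index) + 1
                queue.append(v)
    return index


def climb_to_root(adj, index, s):
    """Walk from s to the root by always stepping to the neighbor with the smallest index."""
    path = [s]
    while index[path[-1]] != 1:
        path.append(min(adj[path[-1]], key=index.get))
    return path


# Hypothetical example graph with root "r".
graph = {
    "r": ["a", "b"], "a": ["r", "c"], "b": ["r", "c", "d"],
    "c": ["a", "b", "e"], "d": ["b", "e"], "e": ["c", "d"],
}
idx = level_order_index(graph, "r")
```

Here `climb_to_root(graph, idx, "e")` returns `["e", "c", "a", "r"]`, a path of three steps, which equals the
distance between e and r in this graph.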

To derive the ASU indexing scheme, the first goal is to divide the vertices into classes of connected subgraphs.
Furthermore, we require all classes to be connected to the root vertex. Another aim is to make the classes of as
similar sizes as possible. Our first step is to make $k_r$ (the degree, or number of neighbors, of r) parallel
depth-first searches.^1 Second, we group the $k_r$ search trees into $\nu$ groups with maximally similar sizes. In
our case, we seek a partition of the search trees into two classes such that the sums of vertices in the respective
classes are as close as possible.^2 Then we go through the levels, starting from the root, and assign numbers such
that vertices of one partition have even indices, while the other has odd indices (this assignment might not always
work). To avoid systematic errors we sample the elements of levels randomly. This construction scheme is
illustrated in Fig. 1(d) and (e).
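The grouping of the search trees into two classes of nearly equal size can be sketched as a simple random local
search. The sketch below is one possible reading of the procedure, not necessarily the exact one used for the
results: a trial flips the class of one or two randomly chosen trees (flipping two trees from different classes
amounts to an exchange), and only improving trials are accepted. The function name, the starting partition and the
`max_stall` parameter are assumptions made for this example.

```python
import random


def balance_two_classes(sizes, max_stall=1000, seed=None):
    """Split trees with the given vertex counts into two classes of similar total size.

    Returns (side, imbalance) where side[i] is 0 or 1 and imbalance is the absolute
    difference between the two class sums.
    """
    rng = random.Random(seed)
    side = [i % 2 for i in range(len(sizes))]  # arbitrary starting partition

    def imbalance(assign):
        return abs(sum(s if a else -s for s, a in zip(sizes, assign)))

    best, stall = imbalance(side), 0
    # Stop when the sums differ by at most one, or after max_stall trials without improvement.
    while best > 1 and stall < max_stall:
        trial = side[:]
        flips = min(len(sizes), rng.choice((1, 2)))
        for i in rng.sample(range(len(sizes)), flips):
            trial[i] ^= 1
        score = imbalance(trial)
        if score < best:
            side, best, stall = trial, score, 0
        else:
            stall += 1
    return side, best
```

For tree sizes `[3, 3, 2, 2, 2]` the balanced partition puts the two trees of size three in one class and the three
trees of size two in the other.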

C. Degree-based and random search 

As a reference, we also run simulations for two depth-first search methods that do not utilize indices. One of
them, Rnd, is regular depth-first search where the neighbors are traversed in random order. In the other, Deg, the
neighbors are chosen in order from high to low degree. Just like for the ASU and ASD methods, a packet is assumed to
have knowledge about its neighborhood: if the destination is in the neighborhood of a vertex, then the search will
be over the next time step.

^1 Every iteration, one step is taken in all branches. The different search branches mark the visited vertices with
their indices. A search proceeds only to vertices not marked by any search. When there are no unmarked vertices, the
search branch is finished.

^2 We do this by randomly exchanging search trees between the two classes and accepting changes that improve the
partition. The search is continued until their vertex-sums differ by at most one, or the partition has not improved
for 1000 trials.
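The two reference methods of this section, Rnd and Deg, can be sketched as a single depth-first walk with a
switchable neighbor order. The sketch assumes that each move, including a backtracking move, costs one time step;
that convention, the function name `dfs_search` and the example graph are assumptions made for this illustration.

```python
import random


def dfs_search(adj, s, t, order="rnd", seed=None):
    """Number of time steps a packet needs to walk depth-first from s to t.

    order="rnd" tries unvisited neighbors in random order,
    order="deg" tries them from high to low degree.
    """
    rng = random.Random(seed)
    visited = {s}
    stack = [s]  # current path from s; the last entry is the packet's position
    steps = 0
    while stack[-1] != t:
        cur = stack[-1]
        if t in adj[cur]:
            nxt = t  # the destination is visible in the neighborhood
        else:
            fresh = [v for v in adj[cur] if v not in visited]
            if order == "deg":
                fresh.sort(key=lambda v: len(adj[v]), reverse=True)
            else:
                rng.shuffle(fresh)
            nxt = fresh[0] if fresh else None
        if nxt is None:
            stack.pop()  # dead end: step back along the remembered path
            if not stack:
                raise ValueError("destination not reachable from source")
        else:
            visited.add(nxt)
            stack.append(nxt)
        steps += 1
    return steps


# Hypothetical example graph.
tree = {0: [1, 2], 1: [0], 2: [0, 3, 4], 3: [2, 5], 4: [2], 5: [3]}
```

On this example `dfs_search(tree, 0, 5, order="deg")` needs three steps, because the higher-degree neighbor is
always on the direct route, whereas the random variant needs between three and seven steps depending on the order
in which dead ends are explored.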

III. NETWORK MODELS 

The efficiency of our indexing and search schemes is more or less directly affected by the network structure. To
investigate this relationship we test the search schemes on four different types of network models: modified
Erdős-Rényi (ER) graphs, square lattices, Barabási-Albert (BA) and Holme-Kim (HK) [8] networks. To facilitate
comparison, we have the same average degree, four (dictated by the square grid), in all networks.

A. Modified ER graphs 

The ER model is the simplest model for randomly generating simple graphs with n vertices and m edges. The edges are
added one by one to randomly chosen vertex pairs (the only restriction being that loops or multiple edges are not
allowed). A problem for our purpose is that ER graphs are not necessarily connected (something required to measure
$\bar{\tau}$). To remedy this we propose a scheme to make networks connected.

1. Detect the connected components.

2. Go through the connected components sequentially. Denote the current component $C_i$.

(a) Pick a component $C_j$ randomly.

(b) Pick a random edge (i, j) whose removal would not fragment $C_j$. If no such edge exists, go to step 2.

(c) Pick a random vertex $i'$ of $C_i$.

(d) Replace (i, j) by $(i', j)$. If the edge $(i', j)$ would exist already (an unlikely event), go to step 2d. If
there is no vertex $i' \in C_i$ such that $(i', j)$ does not already exist, then go to step 2.

3. If the network is disconnected still, go to step 1.

In practice, even for our largest system sizes, the above algorithm converges in a few iterations. The number of
edges needed to be added never exceeds a few percent of m, and this addition is made with greatest possible
randomness; hence we believe the essential network structure of the ER model is conserved.
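The rewiring scheme above can be sketched in Python as follows. This is an illustrative reading of the steps, not a
reference implementation: the retry logic is simplified (candidate edges of $C_j$ are tried in random order until
one can be rewired), the components detected at the start of a sweep are reused throughout that sweep, and all
function names are assumptions made for this example.

```python
import random


def er_graph(n, m, rng):
    """Adjacency sets of a random simple graph with n vertices and m edges."""
    adj = {v: set() for v in range(n)}
    edges = 0
    while edges < m:
        u, v = rng.sample(range(n), 2)  # two distinct vertices, so no loops
        if v not in adj[u]:             # skip multiple edges
            adj[u].add(v)
            adj[v].add(u)
            edges += 1
    return adj


def reachable(adj, start):
    """Set of vertices reachable from start."""
    seen, stack = {start}, [start]
    while stack:
        for v in adj[stack.pop()]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def components(adj):
    comps, seen = [], set()
    for v in adj:
        if v not in seen:
            comp = reachable(adj, v)
            seen |= comp
            comps.append(sorted(comp))
    return comps


def make_connected(adj, rng):
    """Rewire edges until the graph is connected; the number of edges is unchanged."""
    if sum(len(nb) for nb in adj.values()) // 2 < len(adj) - 1:
        raise ValueError("too few edges to connect the graph")
    comps = components(adj)                                        # step 1
    while len(comps) > 1:                                          # step 3
        for ci in comps:                                           # step 2
            cj = rng.choice([c for c in comps if c is not ci])     # step 2a
            members = set(cj)
            edges = [(i, j) for i in cj for j in sorted(adj[i]) if j in members]
            rng.shuffle(edges)
            for i, j in edges:                                     # step 2b
                adj[i].discard(j)
                adj[j].discard(i)
                if j in reachable(adj, i):                         # removal does not fragment
                    free = [v for v in ci if v not in adj[j]]      # steps 2c and 2d
                    if free:
                        k = rng.choice(free)
                        adj[k].add(j)
                        adj[j].add(k)
                        break
                adj[i].add(j)                                      # undo and try another edge
                adj[j].add(i)
        comps = components(adj)
    return adj
```

A typical use is `rng = random.Random(1); g = make_connected(er_graph(200, 400, rng), rng)`, which keeps the
average degree at four while guaranteeing a single component.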



B. Square lattice 

We use square lattices with periodic boundary conditions. The n vertices are spread out regularly on an L x L grid
such that the vertex with coordinates (x, y), $1 \le x, y \le L$, is connected to (x, y+1), (x+1, y), (x, y-1) and
(x-1, y) (if x = 1, we formally let x-1 = L; if x = L, we let x+1 represent 1; and correspondingly for y).
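The periodic boundary rule translates directly into modular arithmetic. The following short Python sketch, with
function names chosen for this illustration, uses the same 1-based coordinates as the text.

```python
def lattice_neighbors(x, y, L):
    """Four neighbors of (x, y) on an L x L grid with periodic boundaries, 1 <= x, y <= L."""
    def wrap(c):
        # Maps 0 to L and L + 1 to 1; leaves 1..L unchanged.
        return (c - 1) % L + 1
    return [(x, wrap(y + 1)), (wrap(x + 1), y), (x, wrap(y - 1)), (wrap(x - 1), y)]


def square_lattice(L):
    """Adjacency lists of the whole lattice, keyed by coordinate pairs."""
    return {(x, y): lattice_neighbors(x, y, L)
            for x in range(1, L + 1) for y in range(1, L + 1)}
```

For L >= 3 every vertex has four distinct neighbors, so the average degree is exactly four.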

C. BA model 

The popular BA model of networks with a power-law degree distribution works as follows (with our parameter
settings). Start with one vertex connected to two degree-one vertices. Iteratively add vertices connected to two
other vertices. Let the probability of connecting the new vertex to a vertex i already present in the network be
proportional to $k_i$ (so-called preferential attachment).
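A minimal Python sketch of this growth rule is given below. Preferential attachment is implemented with a list in
which every vertex appears once per unit of degree, so a uniform draw from the list is proportional to $k_i$. The
function name, the vertex labels and the choice to redraw until two distinct targets are found are assumptions made
for this example.

```python
import random


def ba_graph(n, seed=None):
    """Adjacency sets of a BA network with n >= 3 vertices and two edges per new vertex."""
    rng = random.Random(seed)
    # Seed graph: vertex 0 connected to the two degree-one vertices 1 and 2.
    adj = {0: {1, 2}, 1: {0}, 2: {0}}
    stubs = [0, 1, 0, 2]  # each vertex listed once per edge end
    for new in range(3, n):
        targets = set()
        while len(targets) < 2:
            targets.add(rng.choice(stubs))  # probability proportional to degree
        adj[new] = set()
        for old in sorted(targets):
            adj[new].add(old)
            adj[old].add(new)
            stubs += [new, old]
    return adj
```

The network has 2n - 4 edges, so the average degree approaches four for large n.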

D. HK model 

The HK model [8] is a modification of the BA model to give the network a higher number of triangles. When edges are
added from the new vertex to already present vertices, the first edge is added by preferential attachment to a
vertex i. The second edge is added to one of i's neighbors, forming a triangle.
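The triangle-forming step can be sketched in the same style as the BA growth rule. The sketch assumes the same seed
graph as for the BA model and picks the neighbor for the second edge uniformly at random; both choices, and the
function names, are assumptions made for this illustration.

```python
import random


def hk_graph(n, seed=None):
    """Adjacency sets of an HK-type network with n >= 3 vertices and two edges per new vertex."""
    rng = random.Random(seed)
    adj = {0: {1, 2}, 1: {0}, 2: {0}}
    stubs = [0, 1, 0, 2]  # each vertex listed once per edge end
    for new in range(3, n):
        first = rng.choice(stubs)                # first edge: preferential attachment
        second = rng.choice(sorted(adj[first]))  # second edge: a neighbor of first
        adj[new] = {first, second}
        adj[first].add(new)
        adj[second].add(new)
        stubs += [new, first, new, second]
    return adj


def count_triangles(adj):
    """Number of triangles, each counted once via its ordered vertex triple."""
    return sum(1 for u in adj for v in adj[u] for w in adj[v]
               if u < v < w and w in adj[u])
```

Every added vertex closes at least one triangle, so a network of n vertices contains at least n - 3 triangles.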

IV. NUMERICAL RESULTS 

We study the search schemes on the four different network topologies numerically. We use 100 independent networks
and 100 different s-t pairs for every network. The network sizes range from n = 16 to n = 16,384.

In Fig. 2 we display the average search times as a function of system size for our simulations. The most
conspicuous feature is that the ASD scheme is always, by far, the most efficient. While ASU and Deg are close to the
least efficient method (Rnd), ASD is rather close to the theoretical limit (equal to the average distance d, the
upper border of the shaded areas in Fig. 2). To be more precise, for ASD the ratio of $\bar{\tau}$ to the average
distance is quite constant, about two. The other search schemes (ASU, Deg and Rnd) follow faster increasing
functional forms. For the square lattice, these three schemes increase approximately proportionally to n (the
analytical value for two-dimensional random walks), whereas for ASD, $\bar{\tau}$ scales like distances in square
grids, $n^{1/2}$. One way of interpreting this result is that while ASD manages to find the root as fast as it finds
the destination from the root, ASU fails to find t faster than a random search. The slow downward performance of ASU
is not unexpected: the r-t search in ASU only differs from a random depth-first search in that it does not search
further than the level of the destination, and that it restricts the search space to half its original size by
dividing the vertices into odd and even indices. The fast upward search of ASD is more surprising. In Fig. 3 we show
a network where ASD performs badly. The average time to search upwards is $(n^2 + 20n - 13)/(8n) \to n/8$ as
$n \to \infty$. The downward search takes $3(n - 1)/(2n) \approx 3/2$, giving a total expected value of
$\bar{\tau} \approx n/8$. This can be compared to the average distance $d = 3 - 21/(4n) + 2/n^2 \approx 3$. For this
example, $\bar{\tau}$ and d diverge in a way not seen in the network models. Why is the search so much faster in the
model networks? One point is that the worst-case indexing seen in Fig. 3 is very unlikely. Since the spokes would be
sampled randomly, the chance that a vertex at the perimeter does not find r in two steps is 1/2, the probability
that a perimeter vertex finds r in 3 steps is 1/4, and so on. Carrying on this calculation, a vertex at the
perimeter reaches r in $2 \sum_k k 2^{-k} + 2 \approx 6$ timesteps, giving $\bar{\tau} \approx 5$, not too far from
the observed $\bar{\tau}/d \approx 2$. We note however that for the model networks many other factors that are not
present in the wheel graph of Fig. 3 affect $\bar{\tau}$. For example, the high density of triangles in the HK model
networks will introduce many edges between vertices of the same level in T(G), which will affect the search
efficiency.

FIG. 2 The average search time $\bar{\tau}$ as a function of the graph size n. In all panels, we display data for
the different indexing and search schemes. The shaded areas are unreachable (corresponding to $\bar{\tau}$ values
smaller than the theoretical minimum, the average distance d). The different panels correspond to the modified ER
model, square grid, BA model and HK model networks respectively. Error bars would have been smaller than the symbol
sizes.

FIG. 3 A worst case scenario for navigating from s to r with the ASD indexing and search scheme. A packet from n - 2
to 1 will travel along the perimeter to 3 and then move towards the center.
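The value of about 6 timesteps in the wheel-graph estimate rests on the series $\sum_k k 2^{-k}$. As a quick check,
assuming the sum runs over all $k \ge 1$ (the large-n limit, which is an assumption made here), the series evaluates
to 2:

```latex
\begin{align}
S &= \sum_{k=1}^{\infty} k\,2^{-k}, \\
S - \tfrac{1}{2}S &= \sum_{k=1}^{\infty} k\,2^{-k} - \sum_{k=1}^{\infty} k\,2^{-(k+1)}
                   = \sum_{k=1}^{\infty} 2^{-k} = 1, \\
S &= 2, \qquad 2S + 2 = 6 .
\end{align}
```

The second line uses that the coefficient of $2^{-k}$ in the difference of the two sums is $k - (k - 1) = 1$ for
every $k \ge 1$.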

$\bar{\tau}$ is approximately linear in n for ASU, Deg and Rnd on all network models. The slopes of these curves
are, however, a little different. First, the Deg method is more efficient (compared to ASU and Rnd) for BA networks
than for the modified ER model. This observation (also made in Ref. [1]) can be explained by the skewed degree
distribution in the BA network: the packet reaches high-degree vertices fast. The packet can see a large part of the
network from these hubs, and is therefore more likely to see t. More interesting, perhaps, is that ASU is more
efficient for the networks with a higher density of short cycles (the square lattice and HK models). A rough
explanation is that the partition procedure of ASU cuts off many edges between vertices at the same distance from
r. Since there are many such edges in these network models, the network will effectively be sparser (without
changing G's diameter), which results in a better performance.



V. DISCUSSION 

We have investigated navigation in valued graphs, more specifically in indexed graphs: graphs where every vertex is
associated with a unique number in the interval [1, n]. These indices can be assigned to facilitate the packet
navigation. The packets are assumed to have no a priori knowledge about the network, except the neighborhoods of
their current positions, but memory enough to perform a depth-first search. We find that one of our investigated
methods, ASD, is very efficient for four topologically very different network models. The searches with the ASD
scheme are roughly twice as long as the shortest paths (scaling in the same way as the average distance).

Navigation on indexed graphs has applications in distributed information systems. If, specifically, the amount of
information that can be stored at the vertices were limited, search strategies such as ours would be useful. One
such system is the Autonomous System level Internet, where the information stored at each vertex (with the current
protocols) increases at least as fast as the network itself. For most real-world applications (other examples being
ad hoc networks [4] or peer-to-peer networks) there are additional constraints so that the algorithms of this paper
cannot immediately be applied. Such networks are typically changing over time, so it should ideally be possible to
extend the indexing on the fly as vertices and edges are added to and deleted from the network. Apart from this, a
future direction for research on indexed graphs is to improve the performance of the algorithms presented in this
work. There might be search-tree based algorithms that find neither the shortest path to the root nor the shortest
way to the destination, but are faster on average. For some network topologies there might be faster algorithms
that are not based on constructing a spanning tree. Consider, for example, modular networks [11] (i.e. networks with
tightly connected subgraphs that are only sparsely interconnected). In such networks the search can be divided into
two stages: first find the cluster of the destination, then the destination. These two stages should be reflected
in a fast navigation algorithm.